Executive Summary



This report presents findings from the Enhanced Reading Opportunities (ERO) study, a
demonstration and rigorous evaluation of two supplemental literacy programs that aim to
improve the reading comprehension skills and school performance of struggling ninth-grade
readers. The U.S. Department of Education’s (ED) Office of Elementary and Secondary Education
(OESE)¹ is funding the implementation of these programs, and its Institute of Education
Sciences (IES) is responsible for oversight of the evaluation. MDRC, a nonprofit, nonpartisan
education and social policy research organization, is conducting the evaluation in partnership
with the American Institutes for Research (AIR) and Survey Research Management (SRM).

The present report — the second of three — focuses on the second of two cohorts of 
ninth-grade students to participate in the study and discusses the impact that the two interven- 
tions had on these students’ reading comprehension skills through the end of their ninth-grade 
year. The report also describes the implementation of the programs during the second year of 
the study and provides an assessment of the overall fidelity with which the participating schools 
adhered to the program design as specified by the developers. While this report focuses primari- 
ly on implementation and impacts in the second year of the study, comparisons between the first 
and second year of the study are also provided.² The key findings discussed in the report include
the following: 

• On average, across the 34 participating high schools, the supplemental 
literacy programs improved student reading comprehension test scores 
by 0.08 standard deviation. This represents a statistically significant im- 
provement in students’ reading comprehension (p-value = 0.042). 

• Seventy-seven percent of the students who enrolled in the ERO classes in
the second year of the study were still reading at two or more years below
grade level at the end of ninth grade, relative to the expected reading
achievement of a nationally representative sample of ninth-grade
students.³ One of the two interventions, Reading Apprenticeship Academic
Literacy (RAAL), had a positive and statistically significant
impact on reading comprehension test scores (0.14 standard deviation;
p-value = 0.015). Although not statistically significant, a positive impact
on reading comprehension (0.02 standard deviation) was also produced
by the other intervention, Xtreme Reading. The difference in impacts
between the two programs is not statistically significant, and thus it
cannot be concluded that RAAL had a different effect on reading
comprehension than Xtreme Reading.⁴

• The overall impact of the ERO programs on reading comprehension test
scores in the second year of implementation (0.08 standard deviation) is
not statistically different from their impact in the first year of
implementation (0.09 standard deviation), nor is each intervention’s impact in the
second year of implementation statistically different from its impact in
the first year.

• The implementation fidelity of the ERO programs was more highly
rated in the second year of the study than in the first year. In comparison
with the first year, a greater number of schools in the second year of
the study were deemed to have programs that were well aligned with the
program developers’ specifications for implementation fidelity (26
schools in the second year, compared with 16 schools in the first year),
and fewer schools were considered to be poorly aligned (one school in
the second year, compared with 10 schools in the first year).



¹The implementation was initially funded by the Office of Vocational and Adult Education (OVAE), but
this role was later transferred to OESE.

²James J. Kemple, William Corrin, Elizabeth Nelson, Terry Salinger, Suzannah Herrmann, and Kathryn
Drummond, The Enhanced Reading Opportunities Study: Early Impacts and Implementation Findings, NCEE
2008-4015 (Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center
for Education Evaluation and Regional Assistance, 2008).

³Forty percent of ninth-graders nationally would be expected to score at two or more years below grade
level on the same assessment.

⁴It is important to note that the ERO study is an evaluation of a class of reading interventions, as
represented by Xtreme Reading and RAAL, as well as an evaluation of each of these two programs separately.
The purpose of the study is not to test the differential impact of these two interventions; while Xtreme Reading
and RAAL do differ in some respects, they are both full-year supplemental literacy courses targeted at
struggling adolescent readers that share many common principles, and hence there was no prior expectation that
they would produce substantially different impacts. As noted below, the design of the study is such that
programs are randomized to schools; however, the purpose of this randomization was to ensure that each program
developer was assigned a fair draw of schools in which to implement its program, rather than to test for a
differential impact between the two interventions. By this token, the statistical model chosen for the impact analysis
does not utilize the school-level randomization feature of the research design; nor is the sample size large
enough to detect policy-relevant differences in impacts across the two programs. Because Xtreme Reading and
RAAL represent the same type of intervention, this study was designed to test their joint or overall impact.
Statistical tests were used to confirm that the difference in impacts between the two programs is not
statistically significant and, hence, that it is indeed appropriate to pool together the two program-specific impact
estimates; these statistical tests are not appropriate for making inferences about the true difference in impacts
between the two interventions.

The Supplemental Literacy Interventions

The ERO study is a test of supplemental literacy interventions that are designed as full- 
year courses and targeted to students whose reading skills are two or more years below grade 
level as they enter high school. Two programs — Reading Apprenticeship Academic Literacy 
(RAAL), designed by WestEd, and Xtreme Reading, designed by the University of Kansas 
Center for Research on Learning — were selected for the study from a pool of 17 applicants by 
a national panel of experts on adolescent literacy. To qualify for the project, the programs were 
required to focus instruction in the following areas: (1) student motivation and engagement; (2)
reading fluency, or the ability to read quickly, accurately, and with appropriate expression; (3)
vocabulary, or word knowledge; (4) comprehension, or making meaning from text; (5) phonics
and phonemic awareness (for students who could still benefit from instruction in these areas);
and (6) writing. The overarching goals of both programs are to help ninth-grade students adopt 
the strategies and routines used by proficient readers, improve their comprehension skills, and 
be motivated to read more and to enjoy reading. Both programs are supplemental in that they 
consist of a yearlong course that replaces a ninth-grade elective class, rather than a core academ- 
ic class, and in that they are offered in addition to students’ regular English language arts 
classes. 



The primary differences between the two literacy interventions selected for the ERO 
study lie in their approach to implementation. Implementation of RAAL is guided by the con- 
cept of “flexible fidelity” — that is, while the program includes a detailed curriculum, the 
teachers are trained to adapt their lessons to meet the needs of their students and to supplement 
program materials with readings that are motivating to their classes. Teachers have flexibility in 
how they include various aspects of the RAAL curriculum in their day-to-day teaching activi- 
ties, but they have been trained to do so such that they maintain the overarching spirit, themes, 
and goals of the program in their instruction. 

Implementation of Xtreme Reading is guided by the philosophy that the presentation 
of instructional material — particularly the order and timing with which the lessons are pre- 
sented — is of critical import to students’ understanding of the strategies and skills being 
taught. As such, teachers are trained to deliver course content and materials in a precise, orga- 
nized, and systematic fashion designed by the developers. Xtreme Reading teachers follow a 
prescribed implementation plan, following specific day-by-day lesson plans in which activities 
have allotted segments of time within each class period. Teachers also use responsive
instructional practices to adapt and adjust to student needs that arise as they move through the highly
structured curriculum.

Overview of the Study



Interventions. Reading Apprenticeship Academic Literacy (RAAL) and Xtreme Reading — 
supplemental literacy programs designed as full-year courses to replace a ninth-grade elective 
class. The programs were selected through a competitive applications process based on ratings by 
an expert panel. 

Study sample. Two cohorts of ninth-grade students from 34 high schools and 10 school districts 
(2,916 students in Cohort 1 and 2,679 students in Cohort 2). Districts and schools were selected 
by ED’s Office of Vocational and Adult Education through a special Small Learning Communi- 
ties grant competition. Students were selected based on reading comprehension test scores that 
were between two and five years below grade level. 

Research design. Within each district, high schools were randomly assigned to use either the
RAAL program or the Xtreme Reading program during two school years (2005-2006 and 2006- 
2007). Within each high school, students were randomly assigned to enroll in the ERO class or to 
remain in a regularly scheduled elective class. A reading comprehension test and a survey were 
administered to students in the spring of eighth grade or at the start of ninth grade, prior to random 
assignment, and again at the end of ninth grade. Classroom observations in the first and second 
semester of the school year were used to measure implementation fidelity. 

Outcomes. Reading comprehension and vocabulary test scores, reading behaviors, student atten- 
dance in the ERO classes and other literacy support services, implementation fidelity. 



The ERO Evaluation 

The supplemental literacy programs were implemented in 34 high schools from 10 
school districts across the country. The districts were selected through a special grant competi- 
tion organized by the U.S. Department of Education’s Office of Vocational and Adult Educa- 
tion (OVAE). Experienced, full-time English/language arts or social studies teachers were self- 
selected and approved by ED, the districts, and the schools to teach the programs for a period of 
two years. 

The ERO evaluation utilizes a two-level random assignment research design. First, 
within each district, eligible high schools were randomly assigned prior to the first year of 
program implementation to use one of the two supplemental literacy programs: 17 of the high 
schools were assigned to use RAAL, and 17 schools were selected to use Xtreme Reading.
Each school implemented the same program in two school years: 2005-2006 and 2006-2007. 
In the second stage of the study design, eligible students within each of the participating high 
schools and in each year of the study were randomly assigned either to enroll in the ERO class
(the “ERO group”) or to take one of their school’s regularly offered elective classes (the
“non-ERO group”).

During the second year of the study, the participating high schools identified 2,679 
ninth-grade students with baseline test scores indicating that they were reading two to five 
years below grade level (an average of 79 students per school). Approximately 57 percent of 
these students were randomly assigned to enroll in the ERO class, and the remaining students 
make up the study’s control group and were enrolled in or continued in a regularly scheduled 
elective class. 

Evaluation data were collected with the Group Reading Assessment and Diagnostic
Evaluation (GRADE) reading comprehension and vocabulary tests and a survey.⁵ Both
instruments were administered to students at two points in time: a baseline assessment and survey
in the spring of eighth grade and a follow-up assessment and survey at the end of ninth grade.⁶
Follow-up test scores are available for 2,171 (81 percent) of the students in the study sample.
To learn about the fidelity of program implementation, the study also includes observations of
the supplemental literacy classes during the first and second semester of the school year.
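
As a quick arithmetic check on the sample figures reported above, the per-school average and the follow-up response rate can be recomputed from the counts in the text. This is only an illustrative sketch; the variable names are hypothetical and are not taken from the study.

```python
# Counts reported in the text for the second-year (Cohort 2) sample.
students_identified = 2679   # eligible ninth-grade students identified
schools = 34                 # participating high schools
followup_scores = 2171       # students with follow-up test scores

# Average number of eligible students per school.
per_school = students_identified / schools

# Share of the study sample with follow-up test scores.
response_rate = followup_scores / students_identified

print(f"{per_school:.0f} students per school")
print(f"{response_rate:.0%} follow-up response rate")
```

Both values round to the figures given in the text (79 students per school and 81 percent).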



Second-Year Implementation 

Each ERO teacher (one per school) was responsible for teaching four sections of the 
ERO class. Each section accommodated between 10 and 15 students. Classes were designed to 
meet for a minimum of 225 minutes per week and were scheduled as a 45-minute class every 
day or as a 75- to 90-minute class that met every other day. 

• Of the 34 teachers who participated in the second year of the study, 25 
had taught the entire first year of the study, and two had taught a por- 
tion of the first year (having replaced a teacher midyear). Seven teachers 
were new to the ERO programs at the start of the second year. 

During the second year of the project, the developers for each of the ERO programs
provided three types of training and technical assistance to both new and returning ERO
teachers: a three-day summer training institute in July or August 2006, booster training sessions
during the 2006-2007 school year, and three 2-day coaching visits during the 2006-2007 school
year. Prior to the summer institute, teachers new to the ERO programs also attended additional
training sessions at which they were taught the central strategies of the program being
implemented in their school.



⁵American Guidance Service, Group Reading Assessment and Diagnostic Evaluation: Teacher’s Scoring
and Interpretive Manual, Level H; and Technical Manual (Circle Pines, MN: American Guidance Service,
2001a, 2001b).

⁶In four of the 34 participating schools, baseline testing occurred in the fall of ninth grade rather than the
spring of eighth grade.

The study team assessed the overall fidelity with which the ERO programs were
implemented in each school during the second year of the project. In the context of this study,
“fidelity” refers to the degree to which the observed operation of the ERO program in a given high
school was aligned with the intended learning environment and instructional practices that were
specified by the model’s developers. The analysis of implementation fidelity in the second year
of the study is based on two field research visits to each of the 34 high schools: one during
the first semester and one during the second semester of the 2006-2007 school year. The
classroom observation protocols used in the site visits provided a structured process for observers to
rate the characteristics of the ERO classroom learning environments and the use of ERO
instructional strategies by teachers. The instrument included ratings for six characteristics
(referred to as “constructs” from here forward) that are common to both programs, as well as
ratings for seven program-specific constructs. For each construct, a category rating of 1 (“poorly
aligned”), 2 (“moderately aligned”), or 3 (“well aligned”) was given. 

The analysis of the classroom observation ratings sought to capture implementation
fidelity on two key overarching dimensions of both programs: the classroom learning
environment and the teacher’s use of instructional strategies focused on reading comprehension. A
composite measure of implementation fidelity was calculated for each of these two dimensions 
by averaging across the relevant characteristics in the observation protocol. A composite rating 
of 2.0 or higher indicates that the school’s ERO program was well aligned with the developers’ 
implementation specifications; a rating of 1.5 to 1.9 means that the program was moderately 
aligned; and a rating of 1.0 to 1.4 means that it was poorly aligned. Following is a summary of 
key findings. 

• At the spring site visit, implementation fidelity in 26 of the 34 schools 
was classified as well aligned on both program dimensions. In seven 
schools, implementation was classified as moderately aligned with the 
program model on at least one of the two key program dimensions and 
as moderately or well aligned on the other dimension. In one school, im- 
plementation was deemed to be poorly aligned with the program mod- 
els. 

The overall implementation of the ERO program in a given school was classified as
well aligned if both the classroom environment and the comprehension instruction dimension
were rated as being well aligned. According to the protocols used for the classroom
observations, teacher behaviors and classroom activities in these schools were consistently rated as
being well developed and reflective of the behaviors and activities specified by the developers. At
the fall site visit, the implementation of the ERO programs in 20 of the 34 schools was
classified as well aligned on both program dimensions, and, at the spring site visit, 26 schools had
attained this benchmark. Because implementation fidelity in the majority of the study schools
was deemed to be well aligned to the models, the study team also examined the number of
schools whose implementation of the programs was “very well aligned” to developers’
specifications (defined here as a composite score of 2.5 or higher on both program dimensions). At the
spring site visit, implementation in 13 schools could be classified as such.

Conversely, a school’s overall implementation fidelity was judged to be poorly aligned
with the program model if the composite rating for either the classroom learning environment
dimension or the comprehension instruction dimension was rated as poorly aligned. The ERO
programs in these schools were not representative of the activities and practices intended by the
respective program developers and were found to have encountered serious implementation
problems on at least one of the two key program dimensions during the second year of the
study.⁷ At the fall site visit, implementation of the ERO programs in three of the 34 schools was
classified as poorly aligned with the program models on at least one of the two program
dimensions. At the spring site visit, implementation at one school was considered to be poorly aligned
with the program models.⁸
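
The classification rules described above can be sketched in a few lines of code. This is a minimal illustration, not the study team's actual procedure: it assumes each composite is an unweighted mean of the construct ratings rounded to one decimal place, and the function names and example ratings are hypothetical.

```python
from statistics import mean

def composite(ratings):
    """Average the construct ratings (each 1, 2, or 3) for one dimension.

    Rounding to one decimal place is an assumption made for this sketch.
    """
    return round(mean(ratings), 1)

def classify_dimension(score):
    """Map a composite rating to the alignment category used in the text."""
    if score >= 2.0:
        return "well aligned"
    if score >= 1.5:
        return "moderately aligned"
    return "poorly aligned"

def classify_school(environment_ratings, instruction_ratings):
    """Combine the two dimensions into an overall fidelity category."""
    dims = [classify_dimension(composite(r))
            for r in (environment_ratings, instruction_ratings)]
    if "poorly aligned" in dims:          # poor on either dimension
        return "poorly aligned"
    if all(d == "well aligned" for d in dims):
        return "well aligned"             # well aligned on both dimensions
    return "moderately aligned"

# Made-up ratings for three hypothetical schools.
print(classify_school([3, 3, 2], [2, 2, 3]))
print(classify_school([2, 2, 2], [1, 2, 2]))
print(classify_school([1, 1, 2], [3, 3, 3]))
```

Under these assumptions, a school counts as well aligned only when both dimensions reach 2.0, and a single poorly aligned dimension is enough to place the school in the lowest category.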

• The number of schools considered to be well aligned with the program 
developers’ specifications for implementation fidelity was greater in the 
second year of the study than in the first year (26 schools in the second 
year, compared with 16 schools in the first year). 

At the spring site visit in the second year of the study, the ERO programs in 33 of the 
34 schools reached an overall level of implementation fidelity that was at least moderately 
aligned to the program models (of these, 26 were considered to be well aligned). This is an im- 
provement over the first year of the study, when 24 of the 34 schools had reached a moderate 
level of alignment at the spring site visit (of these, 16 schools were deemed to be well aligned). 
Also, during the spring site visit of the second year, only one school’s implementation of the 
program was poorly aligned to the developers’ specifications. This is lower than what was 
found during the first-year spring site visit, when 10 schools were ranked as poorly aligned on at 
least one of the two key program dimensions. 



⁷In particular, poorly aligned implementation for a given dimension means that the classroom observers
found that at least half of the classroom characteristics were not aligned with the behaviors and activities
specified by the developers and described in the protocols.

⁸In the second year of the study, implementation-fidelity ratings were similar for the 25 schools where the
ERO teacher taught two full years of the program and for the nine schools where the ERO teacher had replaced
another teacher at some point during the study (an average rating of 2.5 for returning teachers and 2.4 for
replacement teachers, out of a maximum score of 3).

Student Enrollment and Attendance in the ERO Classes and
Participation in Literacy Support Activities

The study team collected data on the duration of the ERO classes as well as the fre- 
quency with which students attended the ERO classes and participated in other classes or tutor- 
ing services that aimed to improve their reading and writing skills. 

ERO classes in the second year began an average of 2.3 weeks after the start of the 
school year and operated for an average of nine months. Eighteen schools started the ERO pro- 
gram on the first day of school, and five more schools started within the first two weeks that 
classes were in session. The remaining eleven started their ERO programs an average of seven 
weeks after the start of the school year. Among the students randomly assigned to the ERO 
group, 91 percent enrolled in the ERO classes, and 87 percent were still attending the classes at 
the end of the school year. 

• Students in the ERO group attended 79 percent of the scheduled ERO 
classes, and they received an average of 11 hours of ERO instruction per 
month. 

• Students who were randomly assigned to the study’s ERO group re- 
ported a higher frequency of participation in supplemental literacy ser- 
vices than students who were assigned to the non-ERO group. 

The ERO classes served as the primary source of literacy support services for students 
in the study sample. Although the largest difference in the use of supplemental literacy supports 
between the study’s ERO and non-ERO groups occurred in students’ participation in a supple- 
mentary school-based literacy class (an average of 75 yearly sessions for ERO students and 17 
yearly sessions for non-ERO students), ERO students were also significantly more likely to re- 
port working with a tutor in school (an average of 30 yearly sessions, compared with 12 yearly 
sessions for non-ERO students). 



Impact Findings 

The GRADE assessment was used to measure students’ reading achievement prior to 
random assignment (at “baseline”) and then again in the spring at the end of their ninth-grade 
year (at “follow-up”). The GRADE is a norm-referenced, research-based reading assessment 
that is used widely to measure performance and track the growth of an individual student and 
groups of students. Because the two ERO programs focus primarily on helping students use 
contextual clues to understand the meaning of words, the reading comprehension subtest of the 
GRADE is the primary measure of reading achievement in this study, while the GRADE voca- 
bulary subtest is a secondary indicator of the programs’ effectiveness. Performance levels and 




impacts on both subtests are presented in standard score units; students with a standard score of 
100 points are considered to be reading at grade level.⁹

Following is a summary of the study’s impact findings. 

• When analyzed jointly, the ERO programs produced an increase of 0.8
standard score point on the GRADE reading comprehension subtest.
This corresponds to an effect size of 0.08 standard deviation and is
statistically significant. The overall impact of the programs in the second
year of implementation is not statistically different from their overall
impact in the first year of implementation (0.09 standard deviation).

The top panel of Table ES.1 shows the impacts on spring follow-up reading
comprehension and vocabulary test scores across all 34 participating high schools in the second year of
the study. The first row of data in the table shows that, on average, the reading comprehension
test scores of students in the ERO group are 0.8 standard score point higher than the scores of
students in the non-ERO group, which represents a statistically significant impact (its p-value is
less than or equal to 5 percent).¹⁰ Expressed as a proportion of the overall variability of test
scores for students in the non-ERO group, this estimated impact represents an effect size of 0.08
(or 8 percent of the standard deviation of the non-ERO group’s test scores).
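
The effect-size conversion described above is a single division. The sketch below reproduces it using the non-ERO group standard deviations reported in the notes to Table ES.1 (10.035 for reading comprehension and 9.827 for reading vocabulary); the function and variable names are hypothetical, and because the inputs are the rounded impacts from the table, the results are approximate.

```python
# Standard deviations of the non-ERO group's scores, from the notes to Table ES.1.
NON_ERO_SD = {"comprehension": 10.035, "vocabulary": 9.827}

def effect_size(impact, outcome):
    """Express an impact in standard score points as a share of the non-ERO SD."""
    return impact / NON_ERO_SD[outcome]

overall = effect_size(0.8, "comprehension")   # all 34 schools
raal = effect_size(1.4, "comprehension")      # RAAL schools
xtreme = effect_size(0.2, "comprehension")    # Xtreme Reading schools

for label, value in [("overall", overall), ("RAAL", raal), ("Xtreme Reading", xtreme)]:
    print(f"{label}: {value:.2f}")
```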

Figure ES.1 places this impact estimate in the context of the actual and expected change
in the ERO students’ reading comprehension test scores on the GRADE from the beginning of 
ninth grade to the end of ninth grade. The bottom section of the bar shows that students in the 
ERO group achieved an average standard score of 84.6 at the start of their ninth-grade year. 
This corresponds, approximately, to a grade equivalent of 4.9 (the last month of fourth grade) 
and indicates an average reading level at the 14th percentile for ninth-grade students nationally. 

The middle section of the bar shows the estimated growth in test scores experienced by 
the non-ERO group. At the end of the ninth-grade year, the non-ERO group was estimated to 
have achieved an average standard score of 89.3, which corresponds to a grade equivalent of 6.0 
and an average reading level at the 23rd percentile for ninth-grade students nationally. This 

⁹Based on the national norms used to calculate these scores, a standard score of 100 on the GRADE
reading comprehension or vocabulary test is average for a representative group of students at the end of their
ninth-grade year. The standard deviation of the standard score for both tests is 15.

¹⁰The impact estimates in Table ES.1 are regression-adjusted using ordinary least squares (OLS),
controlling for blocking of random assignment by school and for random differences between the ERO and non-ERO
groups in their baseline reading comprehension test scores and age at random assignment. The values in the
column labeled “ERO Group” are the observed means for students randomly assigned to the ERO group. The
“Non-ERO Group” values in the next column are the regression-adjusted means for students randomly
assigned to the non-ERO group, using the observed mean covariate values for the ERO group as the basis for the
adjustment.

The Enhanced Reading Opportunities Study

Table ES.1

Impacts on Reading Achievement,
Cohort 2 Follow-Up Respondent Sample

Outcome                               ERO      Non-ERO   Estimated   Estimated Impact   P-Value for
                                      Group    Group     Impact      Effect Size        Estimated Impact
-----------------------------------------------------------------------------------------------------------
All schools
  Reading comprehension
    Average standard score            90.1     89.3      0.8 *       0.08 *             0.042
    Corresponding grade equivalent    6.1      6.0
    Corresponding percentile          25       23
  Reading vocabulary
    Average standard score            93.5     93.5      0.0         0.00               0.986
    Corresponding grade equivalent    7.8      7.8
    Corresponding percentile          32       32
  Sample size                         1,264    907

Reading Apprenticeship Academic Literacy schools
  Reading comprehension
    Average standard score            90.2     88.9      1.4 *       0.14 *             0.015
    Corresponding grade equivalent    6.1      5.9
    Corresponding percentile          25       23
  Reading vocabulary
    Average standard score            93.4     93.8      -0.4        -0.04              0.428
    Corresponding grade equivalent    7.7      7.8
    Corresponding percentile          32       33
  Sample size                         645      470

Xtreme Reading schools
  Reading comprehension
    Average standard score            90.0     89.7      0.2         0.02               0.672
    Corresponding grade equivalent    6.1      6.0
    Corresponding percentile          25       24
  Reading vocabulary
    Average standard score            93.5     93.1      0.4         0.04               0.468
    Corresponding grade equivalent    7.8      7.7
    Corresponding percentile          32       31
  Sample size                         619      437



SOURCE: MDRC calculations from the Enhanced Reading Opportunities Study follow-up GRADE 
assessment. 

NOTES: The follow-up GRADE assessment was administered in the spring of 2007 near the end of 
students’ ninth-grade year. 

The estimated impacts are regression-adjusted using ordinary least squares, controlling for blocking of 
random assignment by school and for random differences between the ERO and non-ERO groups in their 
baseline reading comprehension test scores and age at random assignment. The values in the column labeled 
“ERO Group” are the observed means for students randomly assigned to the ERO group. The “Non-ERO 
Group” values in the next column are the regression-adjusted means for students randomly assigned to the 
non-ERO group, using the observed mean covariate values for the ERO group as the basis for the 
adjustment. 

The national average for standard score values is 100, and its standard deviation is 15. The grade 
equivalent and percentile are those associated with the average standard score as indicated in the GRADE 
Teacher's Scoring and Interpretive Manual (Level H, Grade 9, Spring Testing, Form B). No statistical tests 
or arithmetic operations were performed on these reference points. 

The estimated impact effect size is calculated as a proportion of the standard deviation of the non-ERO 
group average (reading comprehension = 10.035; reading vocabulary = 9.827). 

A two-tailed t-test was applied to the impact estimate. The statistical significance is indicated (*) when 
the p-value is less than or equal to 5 percent. 

Rounding may cause slight discrepancies in calculating sums and differences. 
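
The regression adjustment described in the notes above can be illustrated with a small sketch. The data below are synthetic and noise-free, constructed purely for illustration (they are not study data), so ordinary least squares recovers the built-in impact exactly. The helper name `ols`, the number of schools, and every value are assumptions. The model regresses the follow-up score on an ERO indicator, one indicator per school (the random assignment blocks), the baseline score, and age.

```python
from fractions import Fraction

def ols(rows, y):
    """Solve the normal equations (X'X) b = X'y exactly by Gauss-Jordan elimination."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Swap in a row with a nonzero pivot, then scale the pivot to 1.
        pivot = next(r for r in range(col, k) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]

N_SCHOOLS = 3                  # hypothetical number of blocks
TRUE_IMPACT = Fraction(4, 5)   # 0.8 standard score point, built in by construction

rows, y = [], []
for s in range(N_SCHOOLS):
    for i in range(10):
        ero = Fraction(i % 2)                       # 1 = ERO group, 0 = non-ERO group
        baseline = Fraction(80 + (7 * i) % 10 + s)  # made-up baseline score
        age = Fraction(14) + Fraction(i % 3, 2)     # made-up age in years
        school = [Fraction(int(s == j)) for j in range(N_SCHOOLS)]
        rows.append([ero] + school + [baseline, age])
        # Follow-up score: impact + school effect + covariate contributions.
        y.append(TRUE_IMPACT * ero + 40 + 2 * s + baseline / 2 + age / 4)

impact = ols(rows, y)[0]   # coefficient on the ERO indicator
print(float(impact))
```

Exact rational arithmetic is used only to keep the toy example deterministic; with real, noisy data the same design matrix would normally be passed to a statistical package, and the estimate would carry a standard error and p-value.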



growth of 4.7 standard score points for the non-ERO group provides the best indication of what 
the ERO group would have achieved during their ninth-grade year had they not had the oppor- 
tunity to attend the ERO classes. 

The top section of the bar shows the estimated impact of the ERO programs on reading
comprehension test scores. At the end of the ninth-grade year, the ERO group achieved an
average standard score of 90.1, which corresponds to a grade equivalent of 6.1 and an average
reading level at the 25th percentile for ninth-grade students nationally. This means that the ERO
group experienced a growth of 5.5 points in their reading comprehension skills over the course
of ninth grade, which is 0.8 point higher than the growth achieved by the non-ERO group. Thus,
the impact of the ERO programs (0.8 standard score point) represents a 17 percent improvement
over and above the growth that the ERO group would have experienced if they had not had the
opportunity to attend the ERO classes (4.7 points).¹¹
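
The arithmetic behind these growth figures can be retraced as follows. This is a small sketch using only the rounded values quoted in the text; the variable names are hypothetical.

```python
baseline_mean = 84.6       # ERO group mean standard score at baseline
ero_followup_mean = 90.1   # ERO group mean standard score at follow-up
impact = 0.8               # estimated impact, in standard score points

# Actual growth of the ERO group over ninth grade (about 5.5 points).
ero_growth = ero_followup_mean - baseline_mean

# Growth expected without the ERO classes (about 4.7 points).
expected_growth = ero_growth - impact

# Impact relative to the expected growth (about 17 percent).
relative_gain = impact / expected_growth

print(f"{ero_growth:.1f} {expected_growth:.1f} {relative_gain:.0%}")
```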

The solid line at the top of Figure ES.l shows the national average (100 standard score 
points) for students at the end of ninth grade, in the spring. Students scoring at this level are 
considered to be reading at grade level. Thus, the ERO group’s reading comprehension scores 



¹¹The value of 17 percent was calculated by dividing the impact (0.8 standard score point) by the average
improvement of the non-ERO group (4.7 standard score points).
The Enhanced Reading Opportunities Study
Figure ES.1

Impacts on Reading Comprehension,
Cohort 2 Follow-Up Respondent Sample

[Bar chart, All Schools (n = 2,171 students). ERO group mean at baseline: 84.6. Estimated growth for non-ERO group: 4.7. Estimated impact = 0.8*. National average at spring of 9th grade: 100.]

SOURCES: MDRC calculations from the Enhanced Reading Opportunities Study baseline and follow-up 
GRADE assessments. 



NOTES: The baseline GRADE assessment was administered in the fall of 2006 at the start of students’ 
ninth-grade year and prior to their random assignment to the ERO and non-ERO groups. The follow-up 
GRADE assessment was administered in the spring of 2007 near the end of students’ ninth-grade year. 

The ERO group growth at follow-up is calculated as the difference between the unadjusted ERO 
group mean at baseline and the unadjusted ERO group mean at follow-up. The impact was estimated 
using ordinary least squares and adjusted to account for the blocking of random assignment by school 
and to control for random differences between the ERO and non-ERO groups in baseline reading 
comprehension test scores and age at random assignment. The expected ERO group growth at follow-up 
is the difference between the actual ERO group growth and the impact. 

A two-tailed t-test was applied to the impact estimate. The statistical significance is indicated (*) 
when the p-value is less than or equal to 5 percent. 

The national average for standard score values is 100, and its standard deviation is 15. 

Rounding may cause slight discrepancies in calculating sums and differences. 
still lagged nearly 10 points below the national average. In fact, 77 percent of students who
participated in the ERO classes scored two or more years below grade level at the end of their
ninth-grade year,¹² which means that they would still be eligible for the ERO programs were
these programs again made available to them.¹³

• The RAAL program increased students’ reading comprehension test 
scores by a statistically significant amount (0.14 standard deviation). The
Xtreme Reading program produced an impact of 0.2 standard score point
on reading comprehension (0.02 standard deviation), which is not
statistically significant. The difference in impacts between the two
programs is not statistically significant, and thus it cannot be concluded 
that RAAL had a different effect than Xtreme Reading. Nor is there a 
statistically significant difference between each program’s impact in the 
second year of implementation and its impact in the first year of imple- 
mentation. 

The ERO student follow-up survey was administered to students at the same time as the 
follow-up GRADE assessment and includes additional information on students’ reading beha- 
viors and attitudes. Responses to the follow-up survey were used to derive measures for three 
reading behaviors that are intended to be affected by the ERO programs: the number of times 
during the prior month that a student read different types of text in school or for homework, the 
number of times during the prior month that a student read different types of text outside of 
school, and students’ reported use of the reading strategies and techniques that the ERO pro- 
grams try to teach. The overall impact of the programs on students’ reading behaviors is not 
statistically significant. 



The Relationship Between Impacts and Second-Year 
Implementation 

This report also includes an exploratory analysis that investigates the relationship be- 
tween school-level impacts and various aspects of implementation in the second year of the 



¹²Forty percent of ninth-graders nationally would be expected to score two years or more below grade
level on the GRADE administered in the spring of ninth grade.

¹³Furthermore, 87 percent of the students in the ERO group had reading comprehension scores that were
below grade level at the end of ninth grade.

¹⁴The analysis also examines the extent to which impacts on reading comprehension test scores vary
across schools. The impact estimates for each school range from a negative impact of 3.7 standard score points
to a positive impact of 6.2 standard score points. However, the variation in observed school-level impacts is not
statistically significant, indicating that the observed school-to-school variation in impacts may be due to
estimation error and that impacts may not truly vary across schools.
study. Specifically, this analysis examines whether there are differences in impacts between
subgroups of schools defined by teachers’ experience with the ERO program (that is, schools 
whose ERO teacher taught two full years of the program versus schools whose ERO teacher did 
not teach two full years of the program); overall implementation fidelity during the spring site 
visit (that is, very well-aligned, well-aligned, moderately aligned, or poorly aligned implementa- 
tion); and the number of weeks between the start of the school year and ERO program startup 
(schools that started operating their ERO program within two weeks versus those whose pro- 
gram startup was delayed by two weeks or more). The exploratory analysis also examines 
whether there are differences in impacts between schools whose implementation of the pro- 
grams was particularly exemplary (that is, schools that started operating their programs within 
two weeks and whose implementation was very well aligned to the program models) and 
schools that did not meet these two criteria.¹⁵ Based on these exploratory analyses, one cannot
conclude that the programs were more effective in schools with more experienced ERO teach- 
ers, with implementation better aligned with the program models, or with early program startup. 
That is, one cannot infer with certainty that these particular implementation characteristics are 
related to program impacts because the difference in impacts between the groups of schools with- 
in each of the three measured categories of implementation — teacher experience teaching the 
ERO classes, the alignment of the programs as implemented to the program models, and the 
efficiency of program startup — is not statistically significant. Impacts for the groups of schools 
with the most promising implementation characterizations are positive and statistically signifi- 
cant (that is, for the 25 schools whose ERO teacher returned in the second year, having taught 
the entire first year of the program; the 13 schools where the ERO programs were rated as very 
well aligned to the program models; and the 23 schools where the ERO programs began within 
the first two weeks of school).¹⁶ Impacts for the related groups of schools with less promising
implementation characterizations are smaller and not statistically significant (that is, for the 9
schools whose teachers taught ERO for less than two full years, the 21 schools where there was
weaker implementation fidelity, and the 11 schools with program startup that took longer than
two weeks). The difference in impacts between the groups of schools within each of the three 
categories of implementation is not statistically significant. 



¹⁵It is important to note that these analyses are exploratory and are not able to establish causal links
between these aspects of implementation and variation in program impacts across sites, because other school
characteristics and implementation factors may confound the association between school-level impacts and the
implementation factors included in the exploratory analysis.

¹⁶The impacts on reading comprehension test scores for each of these three groups of schools are as
follows: in the 25 schools whose ERO teacher had returned having taught all of the first year of the program, the
effect size is 0.09 standard deviation (p-value = 0.050); in the 13 schools where implementation was rated as
very well aligned to the program models, the effect size is 0.13 standard deviation (p-value = 0.047); and in the
23 schools where the programs began within the first two weeks of school, the effect size is 0.10 standard
deviation (p-value = 0.048).
Next Steps for the ERO Study

The ultimate goal of the two ERO programs is to improve students’ academic perfor- 
mance during high school and to keep them on course toward graduation. With this in mind, the 
final report from the evaluation — scheduled for 2009 — will examine the impact of the pro- 
grams on the achievement and attainment outcomes of both cohorts of students as they progress 
through high school. The outcomes examined in the report will include students’ performance in 
core academic classes, their performance on the high-stakes tests required by their states, their 
grade-to-grade promotion rates, and whether they are on track to graduate from high school. 